1 Problem statement and selected dataset

This project seeks to develop, train and evaluate a statistical model that helps end-users make predictions about whether or not there will be rainfall the following day, given the weather conditions on a given day in Australia. In addition, several multivariate techniques will be implemented in order to extract key insights and relevant information from the available historical data. A better understanding of the factors influencing rainfall in Australia and how these may have changed over time given the harsh climate changes the territory has experienced in the last decade may be of use to predict future droughts and wildfire crises.

The main dataset was obtained from the repository at Kaggle.com. This dataset is built using publicly available climate data, provided by the Australian Bureau of Meteorology and measured by weather stations distributed across the country. It contains more than 100,000 daily weather observations over a 10-year period from 2007 to 2017 from 49 unique locations in Australia. These observations include temperature, rainfall, atmospheric pressure, evaporation, humidity, wind direction, and wind speed at different times during the day. More specifically, the dataset contains 24 columns in total. Columns 2 through 22 are defined by the Australian Government Bureau of Meteorology:

  1. ID: identification number of the observation.
  2. Date: date of the observation.
  3. Location: common name for the weather station location.
  4. MinTemp: minimum temperature in the 24 hours to 9am, degrees Celsius.
  5. MaxTemp: maximum temperature in the 24 hours from 9am, degrees Celsius.
  6. Rainfall: rainfall in the 24 hours to 9am, millimeters.
  7. Evaporation: “Class A” pan evaporation in the 24 hours to 9am, millimeters.
  8. Sunshine: number of hours of bright sunshine in the 24 hours to midnight.
  9. WindGustDir: direction of the strongest wind gust in the 24 hours to midnight. Wind directions can be labeled as: N (North), NNE (North-Northeast), NE (Northeast), ENE (East-Northeast), E (East), ESE (East-Southeast), SE (Southeast), SSE (South-Southeast), S (South), SSW (South-Southwest), SW (Southwest), WSW (West-Southwest), W (West), WNW (West-Northwest), NW (Northwest), NNW (North-Northwest).
  10. WindGustSpeed: speed of strongest wind gust in the 24 hours to midnight, km/h.
  11. WindDir9am: wind direction at 9am, measured with 16 compass points.
  12. WindDir3pm: wind direction at 3pm, measured with 16 compass points.
  13. WindSpeed9am: average wind speed over the 10-minute period prior to 9am, km/h.
  14. WindSpeed3pm: average wind speed over the 10-minute period prior to 3pm, km/h.
  15. Humidity9am: relative humidity percentage at 9am.
  16. Humidity3pm: relative humidity percentage at 3pm.
  17. Pressure9am: atmospheric pressure (hpa) reduced to mean sea level at 9am.
  18. Pressure3pm: atmospheric pressure (hpa) reduced to mean sea level at 3pm.
  19. Cloud9am: fraction of sky obscured by cloud at 9am, measured in oktas, a unit of eights that describes the amount of cloud cover at any given location such as a weather station, ranging from 0 (completely clear sky) to 8 (completely covered sky).
  20. Cloud3pm: fraction of sky obscured by cloud at 3pm, measured in oktas.
  21. Temp9am: temperature at 9am, measured in degree celsius.
  22. Temp3pm: temperature at 3pm, measured in degree celsius.
  23. RainToday: boolean variable; Yes if precipitation (mm) in the 24 hours to 9am exceeds 1 mm, otherwise No. 
  24. RainTomorrow: boolean variable; Yes if the following day precipitation exceeds 1 mm, otherwise No. 

The dataset selected covers the cities presented in the map:



#dirshpcities<-paste0(dirname(current_path),"/Cities.shp")
shapecities <- st_read(paste0(dirname(current_path),"/Cities.shp"))
shaperegions <- st_read(paste0(dirname(current_path),"/GCCSA_2021_AUST_GDA2020.shp"))
ggplot(data = shaperegions) +
    geom_sf(aes(fill = '#d6604d')) + 
    geom_sf(data = shapecities, size = 3, shape = 19, fill = "#4d4d4d")

Table 1 shows a small subset of the data, including only selected observations and attributes. .


# Loading data set

rain_data <- read.csv("weatherAUSOriginal.csv", stringsAsFactors=TRUE)
str(rain_data)
'data.frame':   145460 obs. of  23 variables:
 $ Date         : Factor w/ 3436 levels "2007-11-01","2007-11-02",..: 397 398 399 400 401 402 403 404 405 406 ...
 $ Location     : Factor w/ 49 levels "Adelaide","Albany",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ MinTemp      : num  13.4 7.4 12.9 9.2 17.5 14.6 14.3 7.7 9.7 13.1 ...
 $ MaxTemp      : num  22.9 25.1 25.7 28 32.3 29.7 25 26.7 31.9 30.1 ...
 $ Rainfall     : num  0.6 0 0 0 1 0.2 0 0 0 1.4 ...
 $ Evaporation  : num  NA NA NA NA NA NA NA NA NA NA ...
 $ Sunshine     : num  NA NA NA NA NA NA NA NA NA NA ...
 $ WindGustDir  : Factor w/ 16 levels "E","ENE","ESE",..: 14 15 16 5 14 15 14 14 7 14 ...
 $ WindGustSpeed: int  44 44 46 24 41 56 50 35 80 28 ...
 $ WindDir9am   : Factor w/ 16 levels "E","ENE","ESE",..: 14 7 14 10 2 14 13 11 10 9 ...
 $ WindDir3pm   : Factor w/ 16 levels "E","ENE","ESE",..: 15 16 16 1 8 14 14 14 8 11 ...
 $ WindSpeed9am : int  20 4 19 11 7 19 20 6 7 15 ...
 $ WindSpeed3pm : int  24 22 26 9 20 24 24 17 28 11 ...
 $ Humidity9am  : int  71 44 38 45 82 55 49 48 42 58 ...
 $ Humidity3pm  : int  22 25 30 16 33 23 19 19 9 27 ...
 $ Pressure9am  : num  1008 1011 1008 1018 1011 ...
 $ Pressure3pm  : num  1007 1008 1009 1013 1006 ...
 $ Cloud9am     : int  8 NA NA NA 7 NA 1 NA NA NA ...
 $ Cloud3pm     : int  NA NA 2 NA 8 NA NA NA NA NA ...
 $ Temp9am      : num  16.9 17.2 21 18.1 17.8 20.6 18.1 16.3 18.3 20.1 ...
 $ Temp3pm      : num  21.8 24.3 23.2 26.5 29.7 28.9 24.6 25.5 30.2 28.2 ...
 $ RainToday    : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 2 ...
 $ RainTomorrow : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 2 1 ...
rain_data$Date <- as.Date(rain_data$Date)

df <- subset(rain_data, Date < median(rain_data$Date) & Location %in%    c("Sydney","AliceSprings","Brisbane","Cairns","Perth","Moree"))

datatable(df,options = list(pageLength = 10))
Warning in instance$preRenderHook(instance) :
  It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
Warning in instance$preRenderHook(instance) :
  It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
#datatable(nmcols[c(1,5,8,11,12,14,17,23:26,29,31,38:43,45,48,51,54,57,63:66,70:72),], colnames=c("Nro.","Inicial","Nombre en R","Detalle (por vereda)"),options = list(pageLength = 5))

2 Data preparation

A different project team in Group 12 chose the same data set, so this team will work with the first half of the data set, divided by date, as instructed by the lab professor on 27-Sep-2021.

We will use the median date to split the data set and keep the first half.


missing_stats <- colSums(is.na(df))*100/nrow(df)
ID = c(1:nrow(df))
df <- add_column(df, ID, .before=1)

# Graph
vis_miss(df)


aggr(df, col=c('grey','#252525'), numbers=TRUE, sortVars=TRUE, labels=names(df), cex.axis=.5, gap=1, ylab=c("Missing data"," "),border=NA)
Warning in plot.aggr(res, ...) :
  not enough vertical space to display frequencies (too many combinations)

 Variables sorted by number of missings: 

3 Detecting Missing Values

#Display a table that shows the number of rows with a certain number of NA's
mis_ind = rowSums(is.na(df))
table(mis_ind)
mis_ind
   0    1    2    3    4    5    6    7    8    9   16 
7622  515 1550   93   76   11   26    8    6    1    1 
#Plot the distribution of the number of NA's per row
hist(mis_ind)

In order to build the model, observations should be removed that are missing too much data. Observations that have NA values above the 90th percentile of the NA count distribution should be removed.

#Check 90th percentile of missing data and save value to variable
rm_NA <- quantile(mis_ind,0.90)

We should remove observations with more than 6 NA values.

#Creates an index and data frame of observations that have missing data
#above the cut-off threshold.
m1 <- which(mis_ind > rm_NA)
df_remove1 <- df[m1,]
#Removes observations with too many NA's from data frame used for modeling
df <- df[-c(m1),]

#Summarizes the number of NA's per variable.
mis_col = colSums(is.na(df))
mis_col
           ID          Date      Location       MinTemp       MaxTemp      Rainfall 
            0             0             0             0             1             8 
  Evaporation      Sunshine   WindGustDir WindGustSpeed    WindDir9am    WindDir3pm 
           46           130           962           961           292            26 
 WindSpeed9am  WindSpeed3pm   Humidity9am   Humidity3pm   Pressure9am   Pressure3pm 
            3             1            19            17             8             5 
     Cloud9am      Cloud3pm       Temp9am       Temp3pm     RainToday  RainTomorrow 
          529           588             1             1             8             9 
#Check number of unique locations and summary
num_unique_locations <- length(unique(df$Location))
summary(df$Location)
        Adelaide           Albany           Albury     AliceSprings    BadgerysCreek 
               0                0                0             1530                0 
        Ballarat          Bendigo         Brisbane           Cairns         Canberra 
               0                0             1685             1553                0 
           Cobar     CoffsHarbour         Dartmoor           Darwin        GoldCoast 
               0                0                0                0                0 
          Hobart        Katherine       Launceston        Melbourne MelbourneAirport 
               0                0                0                0                0 
         Mildura            Moree     MountGambier      MountGinini        Newcastle 
               0             1479                0                0                0 
            Nhil        NorahHead    NorfolkIsland        Nuriootpa       PearceRAAF 
               0                0                0                0                0 
         Penrith            Perth     PerthAirport         Portland         Richmond 
               0             1707                0                0                0 
            Sale       SalmonGums           Sydney    SydneyAirport       Townsville 
               0                0             1733                0                0 
     Tuggeranong            Uluru       WaggaWagga          Walpole         Watsonia 
               0                0                0                0                0 
     Williamtown      Witchcliffe       Wollongong          Woomera 
               0                0                0                0 
plot(df$Location)

#Check number of observations below the 10% quantile 
rm_Loc <- quantile(summary(df$Location),0.1)

Locations with less than 1101 observations will be removed since they are below the 10th percentile of number of recorded observations.

#Creates an index and data frame of Locations that have missing data
#below the cut-off threshold.
m2 <- which(summary(df$Location) < rm_Loc)
loc_remove <- levels(df$Location)[c(m2)]
df_remove2 <- subset(df,Location %in% c(loc_remove))
#REMOVE OBSERVATIONS FROM SPECIFIC LOCATIONS
df <- subset(df,!Location %in% c(loc_remove))

5 locations were removed from the data set because of too few observations. They were Katherine, MountGinini, Newcastle, Nhil, and Uluru.

#CHECK NUMBER OF UNIQUE DATES AND TIME SPAN
num_unique_dates <- length(unique(df$Date))
time_difference <- as.numeric((max(df$Date)-min(df$Date)), units="days")
total_years <- time_difference/365

There are weather observations for 1951 unique dates across 44 unique locations over the time span of approximately 5.5 years.

#CALCULATE NUMBER OF MISSING DATES
num_missing_dates <- time_difference - num_unique_dates

There are 1951 unique dates in the data set, however there is a difference of 2039 days between the first observation and the last observation, which means there are 88 dates missing in the time span.

nums <- unlist(lapply(df, is.numeric)) 
numrain_data<-df[ , nums]
numrain_data<-subset(numrain_data,complete.cases(numrain_data))
a<-names(numrain_data)
a<-as.list(a)






fun02<-function(i){index=grep(i,names(numrain_data))
                   bw <- nclass.Sturges(numrain_data[,index]) # Freedman-Diaconis
                   nm=paste0(i)
                   assign(paste("g",i,sep=""),
                   ggplot(numrain_data, aes(numrain_data[,index])) +  
                   geom_histogram(bins = bw,aes(y=..density..), fill="#de2d26") +
                   geom_density(alpha=.35, fill="#08519c",color = NA)  +
                   geom_vline (aes(xintercept=median(numrain_data[,index])),color="#08519c", size=1) + 
                   labs(title=nm, x=NULL, y="UPAS")) +
                   theme(plot.title = element_text(size = rel(0.7),face ="bold",hjust = 0.5),
                        axis.title.y = element_text(size = rel(0.4)),
                        axis.text = element_text(size = rel(0.4)))
                   }
Histos<-lapply(a[-1],fun02)
do.call(grid.arrange, Histos)

Preliminary correlation

# Initial correlation matrix
cor_org<-cor(numrain_data)

# Interactive correlation matrix
heatmaply(round(cor_org, 2),symm=TRUE,colors="RdGy",revC=TRUE,dendrogram="none",cexRow=0.6,cexCol=0.6, main="Correlation matrix")

4 References

LS0tDQp0aXRsZTogIkdyb3VwIDE6IE11bHRpdmFyaWF0ZSBhbmFseXNpcyBvZiBhdXN0cmFsaWFuIGNsaW1hdGUgZGF0YSINCmF1dGhvcjogIkFuZHJlYSBJZ2xlc2lhcyBNdW5pbGxhLCBLYXRocnluIFdlaXNzbWFuLCBEaWFuYSBHYWxpbmRvIEdvbnrDoWxleiwgTWF0ZW8gSsOhY29tZSBHb256w6FsZXogeSBQZWRybyBHb256w6FsZXogUHJhZG8iDQpkYXRlOiAiYHIgU3lzLkRhdGUoKWAiDQpvdXRwdXQ6DQogIGh0bWxfbm90ZWJvb2s6DQogICB0b2M6IHRydWUNCiAgIHRvY19kZXB0aDogMg0KICAgdG9jX2Zsb2F0OiB0cnVlDQogICB0aGVtZTogY2VydWxlYW4NCiAgIGhpZ2hsaWdodDogdGFuZ28NCiAgICNjb2xsYXBzZWQ6IGZhbHNlDQogICAjc21vb3RoX3Njcm9sbDogZmFsc2UNCiAgIG51bWJlcl9zZWN0aW9uczogdHJ1ZQ0KICAgDQojdGhlbWU6IHJlYWRhYmxlDQojaGlnaGxpZ2h0OiBzdWNjZXNzICNodHRwczovL2Jvb3Rzd2F0Y2guY29tLzMvDQojc3VidGl0bGU6IFJlcG9ydA0KLS0tDQoNCmBgYHtyIHNldHVwLCBpbmNsdWRlPUZBTFNFfQ0Kcm0obGlzdD1scyhhbGw9VFJVRSkpDQoNCiMgUmVxdWlyZWQgcGFja2FnZXMNCnBrZ3M8LWMoInJzdHVkaW9hcGkiLCJ0aWR5dmVyc2UiLCJEVCIsIm5hbmlhciIsInRpZHlyIiwic2YiLCJnZ3Bsb3QiLCJnZ3Bsb3QyIiwiY293cGxvdCIsICJnb29nbGV3YXkiLCAiZ2dwbG90MiIsICJnZ3JlcGVsIiwgIlZJTSIsImdnc3BhdGlhbCIsICJsaWJ3Z2VvbSIsICJzZiIsICJybmF0dXJhbGVhcnRoIiwgInJuYXR1cmFsZWFydGhkYXRhIiwiZ3JpZEV4dHJhIiwiZ3JpZCIsImdncGxvdDIiLCJsYXR0aWNlIiwiRmFjdG9NaW5lUiIsImZhY3RvZXh0cmEiLCJjb3JwbG90IiwiaGVhdG1hcGx5IikNCg0KIyBOb24taW5zdGFsbGVkIHBhY2thZ2VzDQppbnNwa2dzPC1wa2dzWyFwa2dzICVpbiUgaW5zdGFsbGVkLnBhY2thZ2VzKCldDQpmb3IobGlicyBpbiBpbnNwa2dzKSBpbnN0YWxsLnBhY2thZ2VzKGxpYnMpDQoNCiMgTG9hZGluZyByZXF1aXJlZA0Kc2FwcGx5KHBrZ3MscmVxdWlyZSxjaGFyYWN0ZXI9VFJVRSkNCg0KIygiYWRlNCIsImNvcnJwbG90IiwiZmFjdG9leHRyYSIsIkZhY3RvTWluZVIiLCJmb3JlaWduIiwiZ2dwbG90MiIsImdyaWRFeHRyYSIsIkhtaXNjIiwiUkNvbG9yQnJld2VyIiwicmVzaGFwZTIiLCJSUG9zdGdyZVNRTCIsImtuaXRyIiwib3Blbnhsc3giLCJOYkNsdXN0IiwiRFQiLCJkM2hlYXRtYXAiLCJoZWF0bWFwbHkiLCJzZiIsInZpcmlkaXMiLCJsZWFmbGV0IiwicGFuZGVyIiwiVklNIiwicGxvdGx5IikNCg0KIyBTZXR0aW5nIHdvcmtzcGFjZQ0KDQpjdXJyZW50X3BhdGggPC0gZ2V0QWN0aXZlRG9jdW1lbnRDb250ZXh0KCkkcGF0aCANCnNldHdkKGRpcm5hbWUoY3VycmVudF9wYXRoICkpDQoNCmBgYA0KDQojIFByb2JsZW0gc3RhdGVtZW50IGFuZCBzZWxlY3RlZCBkYXRhc2V0IA0KDQpUaGlzIHByb2plY3Qgc2Vla3MgdG8gZGV2ZWxvcCwgdHJhaW4gYW5kIGV2YWx1YXRlIGEgc3RhdGlzdGljYWwgbW9kZWwgdGhhdCBoZWxwcyBlbmQtdXNlcnMgbWFrZSBwcmVkaWN0aW9ucyBhYm91dCB3aGV0aGVyIG9yIG5vdCB0aGVyZSB3aWxsIGJlIHJhaW5mYWxsIHRoZSBmb2xsb3dpbmcgZGF5LCBnaXZlbiB0aGUgd2VhdGhlciBjb25kaXRpb25zIG9uIGEgZ2l2ZW4gZGF5IGluIEF1c3RyYWxpYS4gSW4gYWRkaXRpb24sIHNldmVyYWwgbXVsdGl2YXJpYXRlIHRlY2huaXF1ZXMgd2lsbCBiZSBpbXBsZW1lbnRlZCBpbiBvcmRlciB0byBleHRyYWN0IGtleSBpbnNpZ2h0cyBhbmQgcmVsZXZhbnQgaW5mb3JtYXRpb24gZnJvbSB0aGUgYXZhaWxhYmxlIGhpc3RvcmljYWwgZGF0YS4gQSBiZXR0ZXIgdW5kZXJzdGFuZGluZyBvZiB0aGUgZmFjdG9ycyBpbmZsdWVuY2luZyByYWluZmFsbCBpbiBBdXN0cmFsaWEgYW5kIGhvdyB0aGVzZSBtYXkgaGF2ZSBjaGFuZ2VkIG92ZXIgdGltZSBnaXZlbiB0aGUgaGFyc2ggY2xpbWF0ZSBjaGFuZ2VzIHRoZSB0ZXJyaXRvcnkgaGFzIGV4cGVyaWVuY2VkIGluIHRoZSBsYXN0IGRlY2FkZSBtYXkgYmUgb2YgdXNlIHRvIHByZWRpY3QgZnV0dXJlIGRyb3VnaHRzIGFuZCB3aWxkZmlyZSBjcmlzZXMuIA0KDQpUaGUgbWFpbiBkYXRhc2V0IHdhcyBvYnRhaW5lZCBmcm9tIHRoZSByZXBvc2l0b3J5IGF0IFtLYWdnbGUuY29tXVsxXS4gVGhpcyBkYXRhc2V0IGlzIGJ1aWx0IHVzaW5nIHB1YmxpY2x5IGF2YWlsYWJsZSBjbGltYXRlIGRhdGEsIHByb3ZpZGVkIGJ5IHRoZSBbQXVzdHJhbGlhbiBCdXJlYXUgb2YgTWV0ZW9yb2xvZ3ldWzJdIGFuZCBtZWFzdXJlZCBieSB3ZWF0aGVyIHN0YXRpb25zIGRpc3RyaWJ1dGVkIGFjcm9zcyB0aGUgY291bnRyeS4gSXQgY29udGFpbnMgbW9yZSB0aGFuIDEwMCwwMDAgZGFpbHkgd2VhdGhlciBvYnNlcnZhdGlvbnMgb3ZlciBhIDEwLXllYXIgcGVyaW9kIGZyb20gMjAwNyB0byAyMDE3IGZyb20gNDkgdW5pcXVlIGxvY2F0aW9ucyBpbiBBdXN0cmFsaWEuIFRoZXNlIG9ic2VydmF0aW9ucyBpbmNsdWRlIHRlbXBlcmF0dXJlLCByYWluZmFsbCwgYXRtb3NwaGVyaWMgcHJlc3N1cmUsIGV2YXBvcmF0aW9uLCBodW1pZGl0eSwgd2luZCBkaXJlY3Rpb24sIGFuZCB3aW5kIHNwZWVkIGF0IGRpZmZlcmVudCB0aW1lcyBkdXJpbmcgdGhlIGRheS4gIE1vcmUgc3BlY2lmaWNhbGx5LCB0aGUgZGF0YXNldCBjb250YWlucyAyNCBjb2x1bW5zIGluIHRvdGFsLiBDb2x1bW5zIDIgdGhyb3VnaCAyMiBhcmUgZGVmaW5lZCBieSB0aGUgW0F1c3RyYWxpYW4gR292ZXJubWVudCBCdXJlYXUgb2YgTWV0ZW9yb2xvZ3ldWzJdOiANCg0KDQoxLiAqKklEKio6IGlkZW50aWZpY2F0aW9uIG51bWJlciBvZiB0aGUgb2JzZXJ2YXRpb24uDQoyLiAqKkRhdGUqKjogZGF0ZSBvZiB0aGUgb2JzZXJ2YXRpb24uIA0KMy4gKipMb2NhdGlvbioqOiBjb21tb24gbmFtZSBmb3IgdGhlIHdlYXRoZXIgc3RhdGlvbiBsb2NhdGlvbi4gDQo0LiAqKk1pblRlbXAqKjogbWluaW11bSB0ZW1wZXJhdHVyZSBpbiB0aGUgMjQgaG91cnMgdG8gOWFtLCBkZWdyZWVzIENlbHNpdXMuDQo1LiAqKk1heFRlbXAqKjogbWF4aW11bSB0ZW1wZXJhdHVyZSBpbiB0aGUgMjQgaG91cnMgZnJvbSA5YW0sIGRlZ3JlZXMgQ2Vsc2l1cy4NCjYuICoqUmFpbmZhbGwqKjogcmFpbmZhbGwgaW4gdGhlIDI0IGhvdXJzIHRvIDlhbSwgbWlsbGltZXRlcnMuIA0KNy4gKipFdmFwb3JhdGlvbioqOiAiQ2xhc3MgQSIgcGFuIGV2YXBvcmF0aW9uIGluIHRoZSAyNCBob3VycyB0byA5YW0sIG1pbGxpbWV0ZXJzLg0KOC4gKipTdW5zaGluZSoqOiBudW1iZXIgb2YgaG91cnMgb2YgYnJpZ2h0IHN1bnNoaW5lIGluIHRoZSAyNCBob3VycyB0byBtaWRuaWdodC4NCjkuICoqV2luZEd1c3REaXIqKjogZGlyZWN0aW9uIG9mIHRoZSBzdHJvbmdlc3Qgd2luZCBndXN0IGluIHRoZSAyNCBob3VycyB0byBtaWRuaWdodC4gV2luZCBkaXJlY3Rpb25zIGNhbiBiZSBsYWJlbGVkIGFzOiBOIChOb3J0aCksIE5ORSAoTm9ydGgtTm9ydGhlYXN0KSwgTkUgKE5vcnRoZWFzdCksIEVORSAoRWFzdC1Ob3J0aGVhc3QpLCBFIChFYXN0KSwgRVNFIChFYXN0LVNvdXRoZWFzdCksIFNFIChTb3V0aGVhc3QpLCBTU0UgKFNvdXRoLVNvdXRoZWFzdCksIFMgKFNvdXRoKSwgU1NXIChTb3V0aC1Tb3V0aHdlc3QpLCBTVyAoU291dGh3ZXN0KSwgV1NXIChXZXN0LVNvdXRod2VzdCksIFcgKFdlc3QpLCBXTlcgKFdlc3QtTm9ydGh3ZXN0KSwgTlcgKE5vcnRod2VzdCksIE5OVyAoTm9ydGgtTm9ydGh3ZXN0KS4gIA0KMTAuICoqV2luZEd1c3RTcGVlZCoqOiBzcGVlZCBvZiBzdHJvbmdlc3Qgd2luZCBndXN0IGluIHRoZSAyNCBob3VycyB0byBtaWRuaWdodCwga20vaC4NCjExLiAqKldpbmREaXI5YW0qKjogd2luZCBkaXJlY3Rpb24gIGF0IDlhbSwgbWVhc3VyZWQgd2l0aCAxNiBjb21wYXNzIHBvaW50cy4gDQoxMi4gKipXaW5kRGlyM3BtKio6IHdpbmQgZGlyZWN0aW9uIGF0IDNwbSwgbWVhc3VyZWQgd2l0aCAxNiBjb21wYXNzIHBvaW50cy4gDQoxMy4gKipXaW5kU3BlZWQ5YW0qKjogYXZlcmFnZSB3aW5kIHNwZWVkIG92ZXIgdGhlIDEwLW1pbnV0ZSBwZXJpb2QgcHJpb3IgdG8gOWFtLCBrbS9oLg0KMTQuICoqV2luZFNwZWVkM3BtKio6IGF2ZXJhZ2Ugd2luZCBzcGVlZCBvdmVyIHRoZSAxMC1taW51dGUgcGVyaW9kIHByaW9yIHRvIDNwbSwga20vaC4NCjE1LiAqKkh1bWlkaXR5OWFtKio6IHJlbGF0aXZlIGh1bWlkaXR5IHBlcmNlbnRhZ2UgYXQgOWFtLiANCjE2LiAqKkh1bWlkaXR5M3BtKio6IHJlbGF0aXZlIGh1bWlkaXR5IHBlcmNlbnRhZ2UgYXQgM3BtLg0KMTcuICoqUHJlc3N1cmU5YW0qKjogIGF0bW9zcGhlcmljIHByZXNzdXJlIChocGEpIHJlZHVjZWQgdG8gbWVhbiBzZWEgbGV2ZWwgYXQgOWFtLg0KMTguICoqUHJlc3N1cmUzcG0qKjogYXRtb3NwaGVyaWMgcHJlc3N1cmUgKGhwYSkgcmVkdWNlZCB0byBtZWFuIHNlYSBsZXZlbCBhdCAzcG0uDQoxOS4gKipDbG91ZDlhbSoqOiBmcmFjdGlvbiBvZiBza3kgb2JzY3VyZWQgYnkgY2xvdWQgYXQgOWFtLCBtZWFzdXJlZCBpbiBva3RhcywgYSB1bml0IG9mIGVpZ2h0cyB0aGF0IGRlc2NyaWJlcyB0aGUgYW1vdW50IG9mIGNsb3VkIGNvdmVyIGF0IGFueSBnaXZlbiBsb2NhdGlvbiBzdWNoIGFzIGEgd2VhdGhlciBzdGF0aW9uLCByYW5naW5nIGZyb20gMCAoY29tcGxldGVseSBjbGVhciBza3kpIHRvIDggKGNvbXBsZXRlbHkgY292ZXJlZCBza3kpLiANCjIwLiAqKkNsb3VkM3BtKio6IGZyYWN0aW9uIG9mIHNreSBvYnNjdXJlZCBieSBjbG91ZCBhdCAzcG0sIG1lYXN1cmVkIGluIG9rdGFzLg0KMjEuICoqVGVtcDlhbSoqOiB0ZW1wZXJhdHVyZSBhdCA5YW0sIG1lYXN1cmVkIGluIGRlZ3JlZSBjZWxzaXVzLiANCjIyLiAqKlRlbXAzcG0qKjogdGVtcGVyYXR1cmUgYXQgM3BtLCBtZWFzdXJlZCBpbiBkZWdyZWUgY2Vsc2l1cy4NCjIzLiAqKlJhaW5Ub2RheSoqOiBib29sZWFuIHZhcmlhYmxlOyBZZXMgaWYgcHJlY2lwaXRhdGlvbiAobW0pIGluIHRoZSAyNCBob3VycyB0byA5YW0gZXhjZWVkcyAxIG1tLCBvdGhlcndpc2UgTm8uIA0KMjQuICoqUmFpblRvbW9ycm93Kio6IGJvb2xlYW4gdmFyaWFibGU7IFllcyBpZiB0aGUgZm9sbG93aW5nIGRheSBwcmVjaXBpdGF0aW9uIGV4Y2VlZHMgMSBtbSwgb3RoZXJ3aXNlIE5vLiANCg0KDQpUaGUgZGF0YXNldCBzZWxlY3RlZCBjb3ZlcnMgdGhlIGNpdGllcyBwcmVzZW50ZWQgaW4gdGhlIG1hcDoNCg0KYGBge3IgbWVzc2FnZT1GQUxTRSwgcmVzdWx0cz0iaGlkZSIsIGZpZy5hbGlnbiA9ICdjZW50ZXInfQ0KDQoNCiNkaXJzaHBjaXRpZXM8LXBhc3RlMChkaXJuYW1lKGN1cnJlbnRfcGF0aCksIi9DaXRpZXMuc2hwIikNCnNoYXBlY2l0aWVzIDwtIHN0X3JlYWQocGFzdGUwKGRpcm5hbWUoY3VycmVudF9wYXRoKSwiL0NpdGllcy5zaHAiKSkNCnNoYXBlcmVnaW9ucyA8LSBzdF9yZWFkKHBhc3RlMChkaXJuYW1lKGN1cnJlbnRfcGF0aCksIi9HQ0NTQV8yMDIxX0FVU1RfR0RBMjAyMC5zaHAiKSkNCg0KZ2dwbG90KGRhdGEgPSBzaGFwZXJlZ2lvbnMpICsNCiAgICBnZW9tX3NmKGFlcyhmaWxsID0gJyNkNjYwNGQnKSkgKyANCiAgICBnZW9tX3NmKGRhdGEgPSBzaGFwZWNpdGllcywgc2l6ZSA9IDMsIHNoYXBlID0gMTksIGZpbGwgPSAiIzRkNGQ0ZCIpDQoNCmBgYA0KDQpUYWJsZSAxIHNob3dzIGEgc21hbGwgc3Vic2V0IG9mIHRoZSBkYXRhLCBpbmNsdWRpbmcgb25seSBzZWxlY3RlZCBvYnNlcnZhdGlvbnMgYW5kIGF0dHJpYnV0ZXMuIC4gDQoNCmBgYHtyIGRhdGF9DQoNCiMgTG9hZGluZyBkYXRhIHNldA0KDQpyYWluX2RhdGEgPC0gcmVhZC5jc3YoIndlYXRoZXJBVVNPcmlnaW5hbC5jc3YiLCBzdHJpbmdzQXNGYWN0b3JzPVRSVUUpDQpzdHIocmFpbl9kYXRhKQ0KcmFpbl9kYXRhJERhdGUgPC0gYXMuRGF0ZShyYWluX2RhdGEkRGF0ZSkNCg0KZGYgPC0gc3Vic2V0KHJhaW5fZGF0YSwgRGF0ZSA8IG1lZGlhbihyYWluX2RhdGEkRGF0ZSkgJiBMb2NhdGlvbiAlaW4lICAgIGMoIlN5ZG5leSIsIkFsaWNlU3ByaW5ncyIsIkJyaXNiYW5lIiwiQ2Fpcm5zIiwiUGVydGgiLCJNb3JlZSIpKQ0KDQpkYXRhdGFibGUoZGYsb3B0aW9ucyA9IGxpc3QocGFnZUxlbmd0aCA9IDEwKSkNCiNkYXRhdGFibGUobm1jb2xzW2MoMSw1LDgsMTEsMTIsMTQsMTcsMjM6MjYsMjksMzEsMzg6NDMsNDUsNDgsNTEsNTQsNTcsNjM6NjYsNzA6NzIpLF0sIGNvbG5hbWVzPWMoIk5yby4iLCJJbmljaWFsIiwiTm9tYnJlIGVuIFIiLCJEZXRhbGxlIChwb3IgdmVyZWRhKSIpLG9wdGlvbnMgPSBsaXN0KHBhZ2VMZW5ndGggPSA1KSkNCmBgYA0KDQojIERhdGEgcHJlcGFyYXRpb24NCg0KQSBkaWZmZXJlbnQgcHJvamVjdCB0ZWFtIGluIEdyb3VwIDEyIGNob3NlIHRoZSBzYW1lIGRhdGEgc2V0LCBzbyB0aGlzIHRlYW0gd2lsbCB3b3JrIHdpdGggdGhlIGZpcnN0IGhhbGYgb2YgdGhlIGRhdGEgc2V0LCBkaXZpZGVkIGJ5IGRhdGUsIGFzIGluc3RydWN0ZWQgYnkgdGhlIGxhYiBwcm9mZXNzb3Igb24gMjctU2VwLTIwMjEuDQoNCldlIHdpbGwgdXNlIHRoZSBtZWRpYW4gZGF0ZSB0byBzcGxpdCB0aGUgZGF0YSBzZXQgYW5kIGtlZXAgdGhlIGZpcnN0IGhhbGYuDQoNCmBgYHtyIG1pc3NpbmdkYXRhZ3JhcGhzfQ0KDQptaXNzaW5nX3N0YXRzIDwtIGNvbFN1bXMoaXMubmEoZGYpKSoxMDAvbnJvdyhkZikNCklEID0gYygxOm5yb3coZGYpKQ0KZGYgPC0gYWRkX2NvbHVtbihkZiwgSUQsIC5iZWZvcmU9MSkNCg0KIyBHcmFwaA0KdmlzX21pc3MoZGYpDQoNCmFnZ3IoZGYsIGNvbD1jKCdncmV5JywnIzI1MjUyNScpLCBudW1iZXJzPVRSVUUsIHNvcnRWYXJzPVRSVUUsIGxhYmVscz1uYW1lcyhkZiksIGNleC5heGlzPS41LCBnYXA9MSwgeWxhYj1jKCJNaXNzaW5nIGRhdGEiLCIgIiksYm9yZGVyPU5BKQ0KDQpgYGANCg0KIyBEZXRlY3RpbmcgTWlzc2luZyBWYWx1ZXMNCmBgYHtyfQ0KI0Rpc3BsYXkgYSB0YWJsZSB0aGF0IHNob3dzIHRoZSBudW1iZXIgb2Ygcm93cyB3aXRoIGEgY2VydGFpbiBudW1iZXIgb2YgTkEncw0KbWlzX2luZCA9IHJvd1N1bXMoaXMubmEoZGYpKQ0KdGFibGUobWlzX2luZCkNCmBgYA0KYGBge3J9DQojUGxvdCB0aGUgZGlzdHJpYnV0aW9uIG9mIHRoZSBudW1iZXIgb2YgTkEncyBwZXIgcm93DQpoaXN0KG1pc19pbmQpDQpgYGANCkluIG9yZGVyIHRvIGJ1aWxkIHRoZSBtb2RlbCwgb2JzZXJ2YXRpb25zIHNob3VsZCBiZSByZW1vdmVkIHRoYXQgYXJlIG1pc3NpbmcgdG9vIG11Y2ggZGF0YS4gT2JzZXJ2YXRpb25zIHRoYXQgaGF2ZSBOQSB2YWx1ZXMgYWJvdmUgdGhlIDkwdGggcGVyY2VudGlsZSBvZiB0aGUgTkEgY291bnQgZGlzdHJpYnV0aW9uIHNob3VsZCBiZSByZW1vdmVkLg0KDQpgYGB7cn0NCiNDaGVjayA5MHRoIHBlcmNlbnRpbGUgb2YgbWlzc2luZyBkYXRhIGFuZCBzYXZlIHZhbHVlIHRvIHZhcmlhYmxlDQpybV9OQSA8LSBxdWFudGlsZShtaXNfaW5kLDAuOTApDQpgYGANCg0KV2Ugc2hvdWxkIHJlbW92ZSBvYnNlcnZhdGlvbnMgd2l0aCBtb3JlIHRoYW4gNiBOQSB2YWx1ZXMuDQoNCmBgYHtyfQ0KI0NyZWF0ZXMgYW4gaW5kZXggYW5kIGRhdGEgZnJhbWUgb2Ygb2JzZXJ2YXRpb25zIHRoYXQgaGF2ZSBtaXNzaW5nIGRhdGENCiNhYm92ZSB0aGUgY3V0LW9mZiB0aHJlc2hvbGQuDQptMSA8LSB3aGljaChtaXNfaW5kID4gcm1fTkEpDQpkZl9yZW1vdmUxIDwtIGRmW20xLF0NCiNSZW1vdmVzIG9ic2VydmF0aW9ucyB3aXRoIHRvbyBtYW55IE5BJ3MgZnJvbSBkYXRhIGZyYW1lIHVzZWQgZm9yIG1vZGVsaW5nDQpkZiA8LSBkZlstYyhtMSksXQ0KDQojU3VtbWFyaXplcyB0aGUgbnVtYmVyIG9mIE5BJ3MgcGVyIHZhcmlhYmxlLg0KbWlzX2NvbCA9IGNvbFN1bXMoaXMubmEoZGYpKQ0KbWlzX2NvbA0KDQojQ2hlY2sgbnVtYmVyIG9mIHVuaXF1ZSBsb2NhdGlvbnMgYW5kIHN1bW1hcnkNCm51bV91bmlxdWVfbG9jYXRpb25zIDwtIGxlbmd0aCh1bmlxdWUoZGYkTG9jYXRpb24pKQ0Kc3VtbWFyeShkZiRMb2NhdGlvbikNCnBsb3QoZGYkTG9jYXRpb24pDQojQ2hlY2sgbnVtYmVyIG9mIG9ic2VydmF0aW9ucyBiZWxvdyB0aGUgMTAlIHF1YW50aWxlIA0Kcm1fTG9jIDwtIHF1YW50aWxlKHN1bW1hcnkoZGYkTG9jYXRpb24pLDAuMSkNCmBgYA0KDQpMb2NhdGlvbnMgd2l0aCBsZXNzIHRoYW4gMTEwMSBvYnNlcnZhdGlvbnMgd2lsbCBiZSByZW1vdmVkIHNpbmNlIHRoZXkgYXJlIGJlbG93IHRoZSAxMHRoIHBlcmNlbnRpbGUgb2YgbnVtYmVyIG9mIHJlY29yZGVkIG9ic2VydmF0aW9ucy4NCg0KYGBge3J9DQojQ3JlYXRlcyBhbiBpbmRleCBhbmQgZGF0YSBmcmFtZSBvZiBMb2NhdGlvbnMgdGhhdCBoYXZlIG1pc3NpbmcgZGF0YQ0KI2JlbG93IHRoZSBjdXQtb2ZmIHRocmVzaG9sZC4NCm0yIDwtIHdoaWNoKHN1bW1hcnkoZGYkTG9jYXRpb24pIDwgcm1fTG9jKQ0KbG9jX3JlbW92ZSA8LSBsZXZlbHMoZGYkTG9jYXRpb24pW2MobTIpXQ0KZGZfcmVtb3ZlMiA8LSBzdWJzZXQoZGYsTG9jYXRpb24gJWluJSBjKGxvY19yZW1vdmUpKQ0KI1JFTU9WRSBPQlNFUlZBVElPTlMgRlJPTSBTUEVDSUZJQyBMT0NBVElPTlMNCmRmIDwtIHN1YnNldChkZiwhTG9jYXRpb24gJWluJSBjKGxvY19yZW1vdmUpKQ0KYGBgDQo1IGxvY2F0aW9ucyB3ZXJlIHJlbW92ZWQgZnJvbSB0aGUgZGF0YSBzZXQgYmVjYXVzZSBvZiB0b28gZmV3IG9ic2VydmF0aW9ucy4gVGhleSB3ZXJlIEthdGhlcmluZSwgTW91bnRHaW5pbmksIE5ld2Nhc3RsZSwgTmhpbCwgYW5kIFVsdXJ1Lg0KDQpgYGB7cn0NCiNDSEVDSyBOVU1CRVIgT0YgVU5JUVVFIERBVEVTIEFORCBUSU1FIFNQQU4NCm51bV91bmlxdWVfZGF0ZXMgPC0gbGVuZ3RoKHVuaXF1ZShkZiREYXRlKSkNCnRpbWVfZGlmZmVyZW5jZSA8LSBhcy5udW1lcmljKChtYXgoZGYkRGF0ZSktbWluKGRmJERhdGUpKSwgdW5pdHM9ImRheXMiKQ0KdG90YWxfeWVhcnMgPC0gdGltZV9kaWZmZXJlbmNlLzM2NQ0KYGBgDQpUaGVyZSBhcmUgd2VhdGhlciBvYnNlcnZhdGlvbnMgZm9yIDE5NTEgdW5pcXVlIGRhdGVzIGFjcm9zcyA0NCB1bmlxdWUgbG9jYXRpb25zICBvdmVyIHRoZSB0aW1lIHNwYW4gb2YgYXBwcm94aW1hdGVseSA1LjUgeWVhcnMuDQoNCmBgYHtyfQ0KI0NBTENVTEFURSBOVU1CRVIgT0YgTUlTU0lORyBEQVRFUw0KbnVtX21pc3NpbmdfZGF0ZXMgPC0gdGltZV9kaWZmZXJlbmNlIC0gbnVtX3VuaXF1ZV9kYXRlcw0KYGBgDQoNClRoZXJlIGFyZSAxOTUxIHVuaXF1ZSBkYXRlcyBpbiB0aGUgZGF0YSBzZXQsIGhvd2V2ZXIgdGhlcmUgaXMgYSBkaWZmZXJlbmNlIG9mIDIwMzkgZGF5cyBiZXR3ZWVuIHRoZSBmaXJzdCBvYnNlcnZhdGlvbiBhbmQgdGhlIGxhc3Qgb2JzZXJ2YXRpb24sIHdoaWNoIG1lYW5zIHRoZXJlIGFyZSA4OCBkYXRlcyBtaXNzaW5nIGluIHRoZSB0aW1lIHNwYW4uDQoNCg0KYGBge3IgLCBtZXNzYWdlPUZBTFNFLCByZXN1bHRzPSdhc2lzJyxmaWcuYWxpZ24gPSAnY2VudGVyJ30NCm51bXMgPC0gdW5saXN0KGxhcHBseShkZiwgaXMubnVtZXJpYykpIA0KbnVtcmFpbl9kYXRhPC1kZlsgLCBudW1zXQ0KbnVtcmFpbl9kYXRhPC1zdWJzZXQobnVtcmFpbl9kYXRhLGNvbXBsZXRlLmNhc2VzKG51bXJhaW5fZGF0YSkpDQphPC1uYW1lcyhudW1yYWluX2RhdGEpDQphPC1hcy5saXN0KGEpDQoNCg0KDQoNCg0KDQpmdW4wMjwtZnVuY3Rpb24oaSl7aW5kZXg9Z3JlcChpLG5hbWVzKG51bXJhaW5fZGF0YSkpDQogICAgICAgICAgICAgICAgICAgYncgPC0gbmNsYXNzLlN0dXJnZXMobnVtcmFpbl9kYXRhWyxpbmRleF0pICMgRnJlZWRtYW4tRGlhY29uaXMNCiAgICAgICAgICAgICAgICAgICBubT1wYXN0ZTAoaSkNCiAgICAgICAgICAgICAgICAgICBhc3NpZ24ocGFzdGUoImciLGksc2VwPSIiKSwNCiAgICAgICAgICAgICAgICAgICBnZ3Bsb3QobnVtcmFpbl9kYXRhLCBhZXMobnVtcmFpbl9kYXRhWyxpbmRleF0pKSArICANCiAgICAgICAgICAgICAgICAgICBnZW9tX2hpc3RvZ3JhbShiaW5zID0gYncsYWVzKHk9Li5kZW5zaXR5Li4pLCBmaWxsPSIjZGUyZDI2IikgKw0KICAgICAgICAgICAgICAgICAgIGdlb21fZGVuc2l0eShhbHBoYT0uMzUsIGZpbGw9IiMwODUxOWMiLGNvbG9yID0gTkEpICArDQogICAgICAgICAgICAgICAgICAgZ2VvbV92bGluZSAoYWVzKHhpbnRlcmNlcHQ9bWVkaWFuKG51bXJhaW5fZGF0YVssaW5kZXhdKSksY29sb3I9IiMwODUxOWMiLCBzaXplPTEpICsgDQogICAgICAgICAgICAgICAgICAgbGFicyh0aXRsZT1ubSwgeD1OVUxMLCB5PSJVUEFTIikpICsNCiAgICAgICAgICAgICAgICAgICB0aGVtZShwbG90LnRpdGxlID0gZWxlbWVudF90ZXh0KHNpemUgPSByZWwoMC43KSxmYWNlID0iYm9sZCIsaGp1c3QgPSAwLjUpLA0KICAgICAgICAgICAgICAgICAgICAgICAgYXhpcy50aXRsZS55ID0gZWxlbWVudF90ZXh0KHNpemUgPSByZWwoMC40KSksDQogICAgICAgICAgICAgICAgICAgICAgICBheGlzLnRleHQgPSBlbGVtZW50X3RleHQoc2l6ZSA9IHJlbCgwLjQpKSkNCiAgICAgICAgICAgICAgICAgICB9DQpIaXN0b3M8LWxhcHBseShhWy0xXSxmdW4wMikNCmRvLmNhbGwoZ3JpZC5hcnJhbmdlLCBIaXN0b3MpDQoNCmBgYA0KUHJlbGltaW5hcnkgY29ycmVsYXRpb24NCg0KYGBge3IgLCBtZXNzYWdlPUZBTFNFLGZpZy5hbGlnbiA9ICdjZW50ZXInfQ0KIyBJbml0aWFsIGNvcnJlbGF0aW9uIG1hdHJpeA0KY29yX29yZzwtY29yKG51bXJhaW5fZGF0YSkNCg0KIyBJbnRlcmFjdGl2ZSBjb3JyZWxhdGlvbiBtYXRyaXgNCmhlYXRtYXBseShyb3VuZChjb3Jfb3JnLCAyKSxzeW1tPVRSVUUsY29sb3JzPSJSZEd5IixyZXZDPVRSVUUsZGVuZHJvZ3JhbT0ibm9uZSIsY2V4Um93PTAuNixjZXhDb2w9MC42LCBtYWluPSJDb3JyZWxhdGlvbiBtYXRyaXgiKQ0KYGBgDQoNCg0KIyBSZWZlcmVuY2VzDQoNCg0KDQpbMV06IGh0dHBzOi8vd3d3LmthZ2dsZS5jb20vanNwaHlnL3dlYXRoZXItZGF0YXNldC1yYXR0bGUtcGFja2FnZT4gIllvdW5nLCBKLiAoMjAxNywgRGVjZW1iZXIpIFJhaW4gaW4gQXVzdHJhbGlhOiBQcmVkaWN0IG5leHQtZGF5IHJhaW4gaW4gQXVzdHJhbGlhLCBWZXJzaW9uIDIuIFJldHJpZXZlZCAxOCBTZXB0ZW1iZXIgMjAyMS4iDQpbMl06IGh0dHA6Ly93d3cuYm9tLmdvdi5hdS9jbGltYXRlL2RhdGEvICJDb21tb253ZWFsdGggb2YgQXVzdHJhbGlhIEJ1cmVhdSBvZiBNZXRlb3JvbG9neSAyMDIxLCBhY2Nlc3NlZCAxOSBTZXB0ZW1iZXIgMjAyMSINCg==